1. Introduction



An experiment is not completed when the experimental data are collected. Usually,
the data require some processing, an analysis, and possibly some sort of visualization.
The three most commonly encountered approaches to data analysis and graphing are

• manual method using grid paper
• general purpose spreadsheet software
• statistics software suite

The manual method involving grid paper and a pencil often still seems to be the
one favoured by teachers. Indeed, it has a certain pedagogical merit, for it does not
hide anything in black boxes: the student has to work through all the steps him- or
herself. It is also the preferred option in environments where students cannot be
expected to own a personal computer. However, an increasing proportion of students
do have access to a computer, either on their own initiative or as part of an organized
initiative (e.g., [1]), and they find the method tedious, painstaking and old-fashioned.
In all honesty, their teachers do not use this method in their own research work.

General purpose spreadsheet software like Excel (Microsoft), Quattro Pro 
(formerly Borland, Inc. now Corel, Inc.), or the freely available OpenOffice Calc 
(formerly Sun Microsystems, Inc., now Oracle Corporation) contain most functions 
required for data analysis and graphing. They are commonly available and the 
students are generally well-versed in their use. However, spreadsheets have their
drawbacks. They are difficult to debug, and they often require an educated operator
in order to produce a visually satisfying graph without visual distractions.

Due to volume academic licenses, statistical software suites like SPSS (formerly
SPSS, Inc., now IBM), SAS (SAS Institute, Inc.), Stata (StataCorp, Inc.), Statistica
(StatSoft, Inc.), Minitab (Minitab, Inc.), or S-PLUS (TIBCO Software, Inc.) have 
also become a viable option. They usually allow both a spreadsheet-like access and 
programming using a scripting language, and generally produce an excellent graphical 
output. 

Finally, one also needs to mention the fourth group: specialized software 
for scientific graphing such as Origin (formerly MicroCal, Inc., now OriginLab), 
SigmaPlot (formerly Jandel Scientific, now Aspire Software International), IGOR Pro 
(Wavemetrics, Inc.), or Prism (GraphPad Software). This is usually the preferred 
option in a research laboratory, but with a price of around 1000 EUR per license, it 
is usually considered too expensive for a classroom or student laboratory use. 

In this paper, we illustrate two methods, one involving a spreadsheet and another 
one a statistics software suite, on a set of three problems taken from the laboratory 
course prepared for first year students of medicine, dental medicine and veterinary 
medicine [2]. Two of them involve bivariate data (illustrating a linear and a non-linear
dependence) and one involves univariate data (a histogram of radioactive decay). As 
typical examples of each class, two programs were selected: Microsoft Excel (version 
12, included in Microsoft Office 2007) and R (version 2.10.1). Even though one is a 
commercial software product and the other is a free one, we consider them the most 
popular representatives of their respective groups, which justifies their comparison. 
Furthermore, the features presented in R work unchanged in S-Plus, the commercial
implementation of S, while the spreadsheet examples work, except where noted,
in the freely available OpenOffice Calc as well as in Excel.

Microsoft first introduced Excel for Apple Macintosh in 1985, while the version 
for Microsoft Windows followed two years later. It started to outsell its then main 
competitor, Lotus 1-2-3, as early as 1988, and the introduction of Microsoft Windows 
3.0 in 1990 cemented its position as the leading spreadsheet product with a graphical 
user interface (GUI). Microsoft Excel is now available for Microsoft Windows and the 
MacOS X environments. 

R [3, 4] is both a language and an environment for data manipulation, statistical
computing and scientific graphing. R is founded on the S language and environment [5] 
which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) 
by John Chambers, Richard Becker and coworkers. S strove to provide an interactive 
environment for statistics, data manipulation and graphics. Throughout the years, S 
evolved into a powerful object-oriented language [T]- Nowadays, two descendants 
exist which build on its legacy: S-Plus, a commercial package developed by 
TIBCO Software, Inc., and the freely available GNU R. The primary source of R is the
Comprehensive R Archive Network (CRAN), http://cran.r-project.org/.

In the rest of the paper, we first compare Excel and R on three cases taken from an
introductory physics laboratory: a linear and a non-linear regression and a histogram.
We then discuss the positive and the negative aspects of using either approach, and 
finally present the main conclusions.

Figure 1. Using the "Add trendline" feature, linear dependence between the
electrolyte concentration and its conductivity in a dilute electrolyte is easily 
determined with a spreadsheet.

2. Comparison

2.1. Case 1: Linear dependence 

Let us start with a simple example, in which the linear dependence between the 
concentration and the conductivity of a dilute electrolyte is examined. At five different
concentrations of electrolyte, the student measures the resistance of a
beaker filled with electrolyte to the mark, with a pair of electrodes immersed and
connected into a Wheatstone bridge. Using one known value for conductivity, the
student calculates the remaining conductivities from the resistances, plots the points in
a scatterplot, and fits the best line through the data points. Finally, by measuring the
resistance of an additional sample, the student determines its unknown concentration.

The spreadsheet solution is straightforward: data points can be plotted as a 
scatterplot, then with a right-click on any data point, "Add trendline" is selected
(figure 1). In order to learn the slope and the intercept of the straight line, one can
choose "Display equation on chart" in the "Options" tab. Alternatively, one can
find the slope, the intercept and the correlation coefficient through the built-in functions
SLOPE(), INTERCEPT(), and CORREL(). More statistical parameters can be obtained
through the matrix function LINEST(). Individual cells from within the result of a
matrix function can be extracted by embedding LINEST() inside an INDEX() function.
A fully worked example is given in the on-line Supplementary Material. 
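
For readers who want to see what spreadsheet functions of the SLOPE(), INTERCEPT() and CORREL() kind compute, the following R sketch evaluates the textbook least-squares formulas directly on the five data points listed in Appendix A. It is only an illustration of the underlying arithmetic, not a reproduction of the worksheet in the Supplementary Material; the variable names x, y, slope, intercept and r are chosen here for convenience.

```r
# Data as listed in Appendix A
concentration <- c(1e-3, 2.5e-3, 5e-3, 7.5e-3, 1e-2)
resistance    <- c(5.31, 2.66, 1.40, 0.92, 0.78)
conductivity  <- 1e-4 * resistance[1] / resistance

x <- concentration
y <- conductivity

# Closed-form least-squares estimates for a straight line y = intercept + slope * x
sxy <- sum((x - mean(x)) * (y - mean(y)))
sxx <- sum((x - mean(x))^2)
syy <- sum((y - mean(y))^2)

slope     <- sxy / sxx
intercept <- mean(y) - slope * mean(x)
r         <- sxy / sqrt(sxx * syy)   # Pearson correlation coefficient

print(c(slope = slope, intercept = intercept, r = r))
```

The same numbers should be returned by coef(lm(y ~ x)) and cor(x, y), which offers a quick cross-check between the two tools.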

In R, the task can be achieved with a simple script, shown in Appendix A.
The resulting graph is shown in figure 2. The example shows a few
features of R which we want to comment on:

• Since we only have five data points, the data were entered directly into the 
program rather than being read from an external data file. The c() function 
is used to concatenate data into a vector. 

• R encourages programming with vectors. In the expression k/resistance, each 
element of the resulting vector is computed as a reciprocal value of the element 
of the original vector, multiplied by k. 

• Inspired by the TeX typesetting system, R offers a capable method of entering
mathematical expressions into graph labels using the expression() function [8].

• The plot() function is the generic function for plotting objects in R; here we
used it to produce a scatterplot.

• The lm() function is used to fit any of several linear models [9]; we used it to
perform a simple bivariate regression. R is not overly talkative: lm() does
not produce any output on screen, it merely creates the object cond.fit, which
we can later manipulate at will. In the example we used abline(cond.fit) to
plot the regression line atop the data points. If we want to print the coefficients,
we can use coef(cond.fit).

Figure 2. Linear dependence between the electrolyte concentration and its
conductivity in a dilute electrolyte.

In a simple case like linear dependence, a spreadsheet offers a somewhat simpler
solution. R, however, gives more control over its graphical output and generally 
produces a visually more pleasing output. 
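
The last step of the exercise, determining the concentration of an additional sample from its measured resistance, can be sketched in R by inverting the fitted line. The resistance reading r.unknown below is a made-up value used purely for illustration (it is not taken from the paper), and the names b, sigma.unknown and c.unknown are likewise hypothetical.

```r
# Data as listed in Appendix A
concentration <- c(1e-3, 2.5e-3, 5e-3, 7.5e-3, 1e-2)
resistance    <- c(5.31, 2.66, 1.40, 0.92, 0.78)
k             <- 1e-4 * resistance[1]
conductivity  <- k / resistance

# Fit the calibration line and extract its coefficients
cond.fit <- lm(conductivity ~ concentration)
b <- coef(cond.fit)            # b[[1]] is the intercept, b[[2]] the slope

# Hypothetical resistance reading of the unknown sample (assumption)
r.unknown     <- 1.10
sigma.unknown <- k / r.unknown

# Invert sigma = intercept + slope * c to obtain the concentration
c.unknown <- (sigma.unknown - b[[1]]) / b[[2]]
print(c.unknown)
```

Because cond.fit is an ordinary R object, the same coefficients can be reused for any number of unknown samples without refitting.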



2.2. Case 2: Nonlinear dependence 

Not all dependencies are linear. Sometimes, they can be linearized by an appropriate
transformation (e.g., logarithmic), which reduces the task of finding the optimal fitted
curve back to the linear case. With a computer at hand, however, this is not necessary.
In this example, we examine the time dependence of voltage in a circuit with two
capacitors (figure 3). The student first charges the capacitor $C_1$ and then monitors
the voltage as $C_1$ discharges. The voltage $U(t)$ can be written as a sum of two
exponentials:

$$U(t) = A e^{-t/\tau_1} + B e^{-t/\tau_2} . \tag{1}$$

The task is to determine the two amplitudes, $A$ and $B$, and both relaxation times, $\tau_1$
and $\tau_2$.

Solving the problem with a spreadsheet, one first learns that the otherwise
convenient "Add trendline" feature cannot be applied to this case: while it offers a
few functions beyond linear (polynomial, logarithmic, exponential, power and running
average), a sum of several exponential functions is not among them. Instead, we
present two other solutions in Excel. The first one emulates the manual graphical
method, while the second one uses the built-in "Solver" tool.

Figure 3. A circuit scheme with two capacitors. In the experimental setup,
$C_1 = C_2$ and $R_2 \gg R_1$. $C_2$ is discharged at the beginning of the experiment.
The voltage $U$ on $C_1$ is recorded as a function of time $t$.

Figure 4. Logarithm of the dimensionless voltage $u = U / 1\,\mathrm{V}$ on the capacitor
$C_1$ (figure 3) as a function of time $t$; after $\approx 200$ s the fast component dies away
and only the slow component remains, yielding $B$ and $\tau_2$.

The first solution (Supplementary Material, worksheet "2exp log") takes into
account the properties of the exponential function. After a certain period,
the fast component, $A e^{-t/\tau_1}$, dies away, and only the slow component remains:

$$U(t) \approx B e^{-t/\tau_2} . \tag{2}$$

The threshold value can be most easily obtained graphically (figure 4) by plotting $\ln U$
(Supplementary Material, supplement1.xls, worksheet "2exp log", column C) vs. $t$.
Taking the logarithm of (2), one obtains $\ln B$ as the intercept (D13) and $-1/\tau_2$ as the slope
(D12). $\tau_2 = 410$ s and $B = 8.0$ V are computed in D16 and D17, respectively. Once
the slow component is determined, one can compute the fast component (column F)
by subtracting the slow component from the whole:

$$U_h(t) = U(t) - B e^{-t/\tau_2} . \tag{3}$$

From (1) one can see that $U_h(t) = A e^{-t/\tau_1}$; therefore, by taking the logarithm of
(3), one obtains $\ln A$ as the intercept (H13) and $-1/\tau_1$ as the slope (H12), yielding
$A = 3.7$ V and $\tau_1 = 43$ s. This solution retraces the same steps one would take if
equipped only with grid paper and a pocket calculator; the spreadsheet only makes it
slightly more convenient.
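
The two-step linearized procedure can also be sketched in R. Since the measured data set is not reproduced in the text, the sketch below uses synthetic, noise-free data generated from the two-exponential model with illustrative parameter values; the 200 s and 100 s cut-offs, the variable names and the parameter values are all assumptions made for the sake of the example.

```r
# Synthetic, noise-free data from the two-exponential model
# (illustrative parameter values, not the measured data)
A <- 4; B <- 8; tau1 <- 40; tau2 <- 400
t <- seq(0, 600, by = 10)
U <- A * exp(-t / tau1) + B * exp(-t / tau2)

# Step 1: straight-line fit to ln U at late times, where the fast
# component is negligible; intercept = ln B, slope = -1/tau2
late <- t >= 200
slow <- lm(log(U[late]) ~ t[late])
B.est    <- exp(coef(slow)[[1]])
tau2.est <- -1 / coef(slow)[[2]]

# Step 2: subtract the estimated slow component and fit ln of the
# remainder at early times; intercept = ln A, slope = -1/tau1
Uh    <- U - B.est * exp(-t / tau2.est)
early <- t <= 100 & Uh > 0
fast  <- lm(log(Uh[early]) ~ t[early])
A.est    <- exp(coef(fast)[[1]])
tau1.est <- -1 / coef(fast)[[2]]

print(c(A = A.est, B = B.est, tau1 = tau1.est, tau2 = tau2.est))
```

Even without noise, the recovered values differ slightly from those used to generate the data, because the fast component has not vanished completely at the chosen threshold.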

While the linearized solution certainly possesses pedagogical merit, one
needs to realize that the transformation needed to linearize the data inevitably distorts
the experimental error and alters the relationship between the x- and y-values, yielding
an incorrect final result. Instead, one can use the built-in "Solver" tool (the "Solver"
included in the recent (3.1.1) release of OpenOffice Calc is restricted to linear models
and thus inappropriate for the task) to maximize the coefficient of determination,

$$R^2 = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2} , \tag{4}$$

where $y$ are the measured values, $\hat{y}$ the values of the model curve, and $\bar{y}$ the
average of the measured values.

The spreadsheet solution is given in the Supplementary Material (supplement1.xls,
worksheet "2exp nonlin"). As before, column A contains the independent variable
($t$) and column B the dependent variable ($U$). Column C contains the model curve
(1), with the cells J3...J6 containing the coefficients $A$, $B$, $\tau_1$ and $\tau_2$, respectively.
Column D contains the squared difference between the experimental values of the
dependent variable and the model curve, and column E the squared difference between
the individual values of the dependent variable and their average value, the average
value itself being saved in the cell J7. Finally, the cell J10 contains the coefficient of
determination (4).

We fill the cells J3...J6 with the initial guesses for the parameters and start the
Solver tool (Tools → Solver). We want to maximize the coefficient of determination
(4), therefore we set J10 as our target cell, and set the "Equal to" option to "Max".
We enter the range J3:J6 into the "By changing cells" option and start the optimizer by
clicking "Solve". After the solver has finished, we are asked whether we want the
optimized values of parameters to be written into the given cell range J3:J6.
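
The Solver procedure amounts to a general-purpose numerical maximization of the coefficient of determination. A rough R analogue, given here only as a sketch, uses the general optimizer optim() in place of Solver and synthetic noisy data in place of the measurements. The noise level, the random seed, the log-parametrization (used here to keep all four parameters positive) and the function names model and r.squared are assumptions of this example, not part of the original worksheet.

```r
# Synthetic noisy data from the two-exponential model (illustrative values)
set.seed(1)
t <- seq(0, 600, by = 10)
U <- 4 * exp(-t / 40) + 8 * exp(-t / 400) + rnorm(length(t), sd = 0.05)

# Model curve; parameters are passed as logarithms so that they stay positive
model <- function(q, t) {
  p <- exp(q)                               # p = (A, B, tau1, tau2)
  p[1] * exp(-t / p[3]) + p[2] * exp(-t / p[4])
}

# Coefficient of determination: 1 - residual sum of squares / total sum of squares
r.squared <- function(q) {
  1 - sum((U - model(q, t))^2) / sum((U - mean(U))^2)
}

# Initial guesses, then maximize R^2 by minimizing its negative
start <- log(c(A = 5, B = 5, tau1 = 10, tau2 = 100))
fit <- optim(start, function(q) -r.squared(q),
             control = list(maxit = 5000, reltol = 1e-12))

print(exp(fit$par))        # optimized A, B, tau1, tau2
print(r.squared(fit$par))  # achieved coefficient of determination
```

As with Solver, the outcome depends on the initial guesses, so it is worth plotting the model curve over the data before trusting the optimized values.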

The solution in R is shown in Appendix B. It assumes that the data
are saved in a CSV format with semicolon-separated fields and decimal commas:

    "t [s]";"U [V]"
    2,3;12,0
    8,7;11,0

The resulting graph is shown in figure 5. A few features of R used in the example
need some comment:

• read.table() returns an object of the class data.frame; we may visualize it as a
table. Individual columns in the table are addressed as table$column. Column
names are constructed from the table header, with all illegal characters being
replaced by dots. For easier manipulation, two vectors, t and U, were created
from the appropriate columns of the table.

• The actual nonlinear fitting is performed by the nls() function, which returns
an object of the nls class. Apart from the formula, we also supplied the initial
guesses for the fitting parameters. The values of the coefficients can be extracted
by coef(U.nls), yielding $A = 4.0$ V, $B = 8.2$ V, $\tau_1 = 35$ s, and $\tau_2 = 403$ s. The
command summary(U.nls) produces an even more informative report.

• An object of the nls class also provides a method for the predict() function,
which we used to plot a piecewise linear curve; it appears smooth
due to the small step used.
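
The way column names are derived from the header can be checked on the two sample rows quoted above. The sketch below feeds them to read.table() through its text argument instead of a file; this argument is available in current versions of R and may be absent from the version used in the paper, so treat it as an assumption of the example.

```r
# The header and the two sample rows quoted in the text
csv <- '"t [s]";"U [V]"
2,3;12,0
8,7;11,0'

# Semicolon-separated fields, decimal commas, first line is the header
C1 <- read.table(text = csv, dec = ",", sep = ";", header = TRUE)

# Spaces and brackets in the header are replaced by dots
original.names <- names(C1)
print(original.names)

# Optionally give the columns shorter names for later use
names(C1) <- c("t", "U")
print(C1)
```

Renaming the columns once, right after reading the file, avoids having to type the dotted names in every subsequent command.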

In this case, the solution in R does not appear any more complex than the linear
case: the considerably more difficult mathematics of the non-linear regression are
hidden from the user. On the other hand, a spreadsheet like Excel has no direct
support for non-linear regression, and consequently the user has to work through
some of the necessary steps manually.

Figure 5. The voltage $U$ on the capacitor $C_1$ (figure 3) as a function of time $t$;
measured data (o) and a fitted curve.



2.3. Case 3: Histogram 

Our third example examines the statistics of radioactive decay. Using a Geiger counter,
the student is required to record the number of decays detected in a period of time
(in our case, 10 s), and repeat the measurement 100 times. After subtracting the base
activity, the student computes the mean value $\bar{x}$ and the standard deviation $\sigma_x$, and plots
a histogram.

In a histogram, one plots the frequencies of the cases falling into each of several 
categories, where the categories are usually specified as non-overlapping intervals of 
the variable in question. In a spreadsheet, the FREQUENCY () function can be used to 
compute the frequencies. This is a matrix function, which takes two ranges of cells
as input (the data array and the bins array) and returns an array of frequencies as
output; in Excel, the returned array has one element more than the bins array, the
last element counting the values above the highest bin. While calculating the
frequencies is supported, neither
Excel nor OpenOffice Calc provides direct support for histogram plotting. Since the
bins were chosen to be of the same size in our example, we can emulate them with
bar charts (figure 6). While one can eliminate the gaps between bars in the chart
settings, label positions reveal the fact that we are dealing with a bar chart rather than
a histogram.

The script shown in Appendix C demonstrates computing and plotting
the histogram in R. The final result is shown in figure 7. Again, a few comments.

• Since the format of the input data file is very simple (one figure, the number of
detected decays per 10 s, per line), we have used the scan() function instead of
read.table(); x is thus a vector rather than a data frame.

• The histogram is plotted using the hist() function. The breaks option is used
to specify bin size and boundaries. By default, R uses Sturges' formula for
distributing $n$ samples into $k$ bins:

$$k = \lceil \log_2 n + 1 \rceil .$$

• The normal probability density function dnorm() is plotted over the
histogram with the curve() function and the add = TRUE option.

Figure 6. Lacking the support for histograms, one needs to resort to bar charts
to plot a histogram of radioactive decay with a spreadsheet.

Figure 7. Experimentally obtained histogram of radioactive decay and a normal
distribution ($\bar{x} = 5.61$, $\sigma_x = 2.60$) plotted on top of it.



With direct support for histograms, R is clearly the preferred solution over a
spreadsheet in this case, a fact which would become even more apparent if the nature
of the problem required histograms with unequal bin widths.
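
The binning itself can be inspected without drawing anything by calling hist() with plot = FALSE. The sketch below uses synthetic Poisson counts as a stand-in for the 100 Geiger counter readings; the rate, the random seed and the unequal bin edges are arbitrary choices made for illustration only.

```r
# Synthetic stand-in for 100 readings, with the base activity subtracted
set.seed(42)
x <- rpois(100, lambda = 8.3) - 2.68

# Unit-width bins, as in Appendix C; plot = FALSE returns the binning only
h <- hist(x, breaks = seq(floor(min(x)), ceiling(max(x))), plot = FALSE)
print(h$counts)

# Sturges' formula for n = 100 samples: ceiling(log2(100) + 1) = 8 bins
k <- ceiling(log2(length(x)) + 1)
print(k)

# Unequal bin widths: pass the bin edges explicitly; for such bins hist()
# plots densities rather than raw counts by default
edges <- unique(sort(c(floor(min(x)), 2, 4, 5, 6, 8, ceiling(max(x)))))
h2 <- hist(x, breaks = edges, plot = FALSE)
print(h2$density)
```

With unequal bins the bar areas, not the bar heights, are proportional to the number of samples, which is exactly the property that a bar chart cannot reproduce.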

3. Discussion 



The amount of information a working physicist — or most any scientist — has to cope 
with often exceeds the human capacity of processing in a tabular form and requires 
some visualization technique. A certain degree of graphic, or visual, literacy is therefore
required. As a result, a fair amount of emphasis is given to teaching this skill in the
course of a physics curriculum [11, 12, 13]. Producing graphs manually involves some
processing of data with a pocket calculator, determining the minimal and the maximal
values of data in the x- and y-direction, choosing the appropriate scales on the x- and
y-axis, plotting the data points, drawing the best-fit line and possibly using it in the
subsequent analysis. In the case of histogram plotting, the student has to choose
the bin size and manually arrange the samples into bins. It is instructive to perform
this whole process once or possibly a few times in order to get familiar with all the
steps required. However, we cannot overlook the fact that the processes involved in a
manual production of a graph are time-consuming, error-prone, and do not contribute 
significantly to a better understanding of the problem studied. In this respect, we 
have to agree with earlier studies [14] that if the purpose of the graph is data analysis 
rather than acquiring the skill of plotting a graph, the time spent with manual graph 
plotting is better spent discussing its meaning. 

By employing the "variable equals spreadsheet cell" paradigm where users can
see their variables and their contents on screen, spreadsheets have been extremely
successful in bringing data processing power to users who would
otherwise not have embraced a traditional approach to computer programming. The
usefulness of spreadsheets in education in general (cf. [15] and the references therein)
and for teaching physics in particular [16, 17, 18], including laboratory courses [19, 20],
has been recognized early on.

However, the approach taken by spreadsheets also has its drawbacks. By placing 
emphasis on visualizing the data itself, the relations between data are less emphasized 
than in traditional computer programs, and it is generally more difficult to debug 
a spreadsheet than a traditional computer program of comparable complexity (cf.
[21] and the references therein). The fact that the variables are referred to by their
grid address rather than by some more meaningful name is an additional hindrance.
Leaving aside the questioned validity of some of the statistics functions built into
Excel [22, 23, 24, 25], which is outside the scope of this paper, we wish to deal with
another sort of criticism concerning the graphical output of spreadsheet programs.
While a skilled user can produce clear, precise and efficient graphs with Excel (cf.
[26] and the references therein), it is perceived [27] that less skilled users are easily
misled into excessive use of various embellishments (dubbed "chartjunk" [28]) that
do not add to the information content. Telling good graphs from bad ones is not 
merely a subject of aesthetics, there exists an extensive body of work in this area (c/. 

Where does R stand in comparison? As expected from a statistics suite, R 
outperforms Excel in terms of statistical accuracy [34]. R also does a much better 
job at graphing. Except for the obvious required modifications like axes labels, the 
default settings are often already satisfactory. We have nevertheless demonstrated in 
figure 2 how features like grid lines can be added, should they be considered necessary.
Excel charts, on the other hand, come by default with an obtrusive grey background
and horizontal grid lines (as seen in figure 6), which do not convey any information
("non-data ink", in Edward Tufte's terminology). In terms of ease of use, Excel excels
if the relationship between the data assumes any of the four pre-defined possibilities: 
linear, polynomial, exponential or logarithmic. Aside from these, more or less clever 
techniques need to be adopted in order to use Excel for data analysis, most of them 
being too complex for the students to be expected to invent them on their own. In R, 
even the linear modelling requires a few lines of code. On the other hand, however, 
non-linear modelling is hardly any more complex than linear modelling for the end 
user, and is considerably easier than in Excel. The same is also true for plotting 
histograms. 

The potential of R in a classroom environment has already been examined in
various disciplines, e.g., statistics [35], econometrics [36], and computational biology
[37]. The extent of its use in student laboratories throughout the world is, however,
hard to estimate. On the other hand, we can observe that the use of R is growing
explosively in research laboratories, where one would actually expect slower adoption
due to the additional competition it faces both from the commercial scientific graphing
software and from the commercial statistical software suites. Thomson Reuters ISI
Web of Science® shows that the R manual [4] has accumulated over 10,000 citations
since 1999, over 4000 of these in the year 2009 alone.

The main obstacle preventing its more widespread adoption is the fact that 
students entering the laboratory course usually have no previous experience with 
R, while they usually do have previous experience with spreadsheets. Even though 
spreadsheet users essentially engage in functional programming [38], they generally
do not perceive it as programming. Experience shows [37] that students cannot
be expected to teach themselves R, even in the case of graduate students of computational
biology, and that a quick introductory course is recommended. The best synergy
might be achieved when students take an introductory course in statistics before the 
physics laboratory course or simultaneously with it. 

Before we conclude, we would like to mention another weak aspect of the 
spreadsheet paradigm: collaboration in spreadsheet authoring is inherently difficult. 
The changes are introduced to the spreadsheet on the level of spreadsheet cells, and 
a minor change, replicated across hundreds or thousands of cells, renders traditional 
revision control systems practically useless, making it very difficult to determine
who changed what and when. R scripts are plain ASCII text, and
revision control systems either as basic as RCS [39] or arbitrarily more complex (CVS,
Subversion, BitKeeper, etc.) can be easily employed. By running scripts in batch mode
rather than using R interactively, reproducible research techniques [40] are encouraged.
One further step in this direction is Sweave [41], which allows creating integrated
script/text documents.
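
As a minimal sketch of what batch use might look like, the script below writes its figure to a PDF file instead of the screen, so that it can be run unattended, for instance with the Rscript front end. The file name analysis.R and the output location are hypothetical; the data are those listed in Appendix A.

```r
# Hypothetical script, e.g. saved as analysis.R and run with:
#   Rscript analysis.R
concentration <- c(1e-3, 2.5e-3, 5e-3, 7.5e-3, 1e-2)
resistance    <- c(5.31, 2.66, 1.40, 0.92, 0.78)
conductivity  <- 1e-4 * resistance[1] / resistance

# Send graphics to a file rather than to an interactive window;
# the output path is an arbitrary choice for this example
out <- file.path(tempdir(), "conductivity.pdf")
pdf(out, width = 5, height = 4)
plot(concentration, conductivity)
abline(lm(conductivity ~ concentration))
dev.off()

cat("figure written to", out, "\n")
```

Because both the script and its output are regenerated from plain text, the script can be kept under revision control and the figure reproduced at any later time.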

4. Conclusions 

In the paper, two methods of data analysis and graphing were compared: spreadsheet 
software, represented by Microsoft Excel, and a statistics software suite, represented 
by GNU R. Both methods were tried on three typical tasks taken from an introductory 
physics laboratory; similar tasks can be, however, encountered in any science 
laboratory course. The tasks encompass linear dependence, non-linear dependence 
and a histogram. It has been determined that a spreadsheet offers an easier approach
when the data model is simple (in our case, linear dependence). Beyond a few simple 
models, R offers a more user-friendly solution. Histogram plotting, considered one of 
the basic tools for data analysis, is also inadequately supported in Excel (the same 
is true also for other popular spreadsheet software like OpenOffice Calc and Apple 
Numbers). Using default settings, R produces charts with a higher ratio of data 
conveying information to data conveying no information. 

Spreadsheets are often the only software solution for data analysis students
encounter during their education. We believe the described shortcomings of
spreadsheets make it worth exposing students to other existing solutions as well. A
statistical software suite is the solution which can most easily replace a spreadsheet in
data analysis and graphing tasks, and its use should be encouraged in particular
in situations where it can be reused in a statistics course. Among the statistical
software, we emphasized R. It is free; students can install a copy at home. Using R,
well-designed publication-quality plots can be produced with ease. It is an object-oriented
matrix language, which encourages thinking on a more abstract level. The
extent of utilization that R has recently experienced in various scientific disciplines 
signifies it is more than a marginal phenomenon, making it more likely that students 
who acquire a certain degree of proficiency with R/S/S-Plus in the course of their 
studies will be able to make use of this skill later in their careers. 

Supplementary material 

Worked-out examples employing a spreadsheet are available free of charge on the IOP
web site. An Excel (version 11; Microsoft Office 2003) spreadsheet supplement1.xls
contains four worksheets: a solution for linear regression, two solutions for the
two-exponential function ("2exp log" and "2exp nonlin"), and a histogram.

Appendix A. Linear dependence

    # measured data: electrolyte concentration and resistance
    concentration <- c(1.e-3, 2.5e-3, 5.e-3, 7.5e-3, 1.e-2)
    resistance <- c(5.31, 2.66, 1.40, 0.92, 0.78)

    # calculate the conductivity
    k <- 1.e-4 * resistance[1]
    conductivity <- k / resistance

    # define the labels on the x- and y-axis
    xlabel <- expression(paste(italic(c), " [mol/L]"))
    ylabel <- expression(paste(sigma, " [(", Omega, " cm)"^-1, "]"))

    # plot the data
    plot(concentration, conductivity,
         xlab = xlabel, ylab = ylabel,
         xlim = c(0, 0.012), ylim = c(0, 8.e-4))
    grid(col = "darkgray")

    # calculate the best-fit line
    cond.fit <- lm(conductivity ~ concentration,
                   data = data.frame(concentration, conductivity))

    # plot the best-fit line
    abline(cond.fit)

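Appendix B. Nonlinear dependence

    # read the semicolon-separated data with decimal commas
    C1 <- read.table("capacitor.csv", dec = ",", sep = ";", header = TRUE)

    # header characters that are illegal in names were replaced by dots
    U <- C1$U..V.
    t <- C1$t..s.

    # fit the sum of two exponentials, starting from rough initial guesses
    U.nls <- nls(U ~ A*exp(-t/t1) + B*exp(-t/t2),
                 start = list(A = 5, B = 5, t1 = 10, t2 = 100))

    # plot the measured data
    plot(C1, xlab = expression(paste(italic(t), " [s]")),
         ylab = expression(paste(italic(U), " [V]")), xlim = c(0, 600))

    # overlay the fitted curve, evaluated in 1 s steps
    lines(0:600, predict(U.nls, list(t = 0:600)))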

Appendix C. Histogram

    # detected decays per 10 s
    x <- scan("decay.txt")

    # subtract the base (normalized per 10 s)
    x <- x - 2.68

    n <- length(x)
    avg <- mean(x)
    stdev <- sd(x)

    # draw the histogram
    hist(x, breaks = seq(floor(min(x)), ceiling(max(x))),
         xlab = "decays / 10 s", ylab = "frequency")

    # overlay the normal density scaled to counts (bin width is 1); n is
    # computed beforehand because curve() rebinds x to its own plotting grid
    curve(n * dnorm(x, mean = avg, sd = stdev), add = TRUE)



